Keyword Spices: A New Method for Building Domain-Specific Web Search Engines

Authors

  • Satoshi Oyama
  • Takashi Kokubo
  • Toru Ishida
  • Teruhiro Yamada
  • Yasuhiko Kitamura
Abstract

This paper presents a new method for building domain-specific web search engines. Previous methods eliminate irrelevant documents from the accessed pages using heuristics based on human knowledge about the domain in question. Accordingly, they are hard to build and cannot be applied to other domains. The keyword spice method, in contrast, improves search performance by adding domain-specific keywords, called keyword spices, to the user’s input query; the modified query is then forwarded to a general-purpose search engine. Keyword spices can be discovered automatically from web documents, allowing us to build high-quality domain-specific search engines in various domains without collecting heuristic knowledge. We describe a machine learning algorithm, a type of decision-tree learning algorithm, that can extract keyword spices. To demonstrate the value of the proposed approach, we conduct experiments in the domain of cooking. The results confirm the excellent performance of our method in terms of both precision and recall.
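The query-modification idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the learning algorithm's details or the query syntax, so everything below is an assumption made for illustration: the toy decision tree is hand-written rather than learned, the keywords are invented, and the `AND`/`OR`/`NOT` syntax is hypothetical (real search engines differ). The sketch reads each path to a "relevant" leaf as a conjunction of keyword tests, joins the paths into one Boolean expression, and appends that expression to the user's query.

```python
from typing import List, Optional, Tuple, Union

# A toy decision tree over keyword presence (an assumption, not the paper's
# learned tree). A leaf is a bool (True = page belongs to the domain); an
# internal node is (keyword, subtree_if_absent, subtree_if_present).
Tree = Union[bool, Tuple[str, "Tree", "Tree"]]


def positive_paths(tree: Tree, path: Optional[List[str]] = None) -> List[List[str]]:
    """Return one list of literals for every path that ends in a True leaf."""
    path = path or []
    if isinstance(tree, bool):
        return [path] if tree else []
    keyword, absent, present = tree
    return (positive_paths(present, path + [keyword])
            + positive_paths(absent, path + ["NOT " + keyword]))


def spice_expression(tree: Tree) -> str:
    """Render the positive paths as a disjunction of conjunctions."""
    clauses = ["(" + " AND ".join(p) + ")" for p in positive_paths(tree)]
    return " OR ".join(clauses)


def spiced_query(user_query: str, spice: str) -> str:
    """Append the spice expression to the user's query (hypothetical syntax)."""
    return f"{user_query} AND ({spice})"


# Hypothetical tree for the cooking domain; the keywords are illustrative only.
TOY_TREE: Tree = (
    "ingredients",
    ("tablespoon", False, True),   # "ingredients" absent: check "tablespoon"
    ("software", True, False),     # "ingredients" present: reject if "software"
)

if __name__ == "__main__":
    print(spiced_query("beef", spice_expression(TOY_TREE)))
```

The modified string is what would be forwarded to a general-purpose search engine in place of the bare query "beef".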

Similar resources

Query Refinement for Domain-Specific Web Search

The expansion of the Internet and the number of its users has raised many new problems in information retrieval. The most common way to find information in the web is using web search engines. However, gathering information from the web is a difficult task for a novice user even if he uses the search engines. The user must have experience and skill to find the relevant pages from the large numb...

Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology

Due to the growth of the web, there are many challenges in establishing a general framework for data mining and retrieving structured data from the Web. Creating an ontology is a step towards solving this problem. The ontology raises the main entity and the concept of any data in data mining. In this paper, we tried to propose a method for applying the "meaning" of the search system, but the problem ...

A New Hybrid Method for Web Pages Ranking in Search Engines

There are many algorithms for optimizing search engine results; ranking takes place according to one or more parameters such as backward links, forward links, content, and click-through rate. The quality and performance of these algorithms depend on the listed parameters. The ranking is one of the most important components of the search engine that represents the degree of the vitality...

Understanding Users Intent by Deducing Domain Knowledge Hidden in Web Search Query Keywords

Search Engines are used by people on a daily basis to retrieve information from the web. When an ambiguous word is present in a query, specific sense of the keyword is not considered during the search process. Search engines return a large amount of web pages as results from all the possible contexts. Users tend to browse only few pages. Improving quality of retrieved results is a challenge and...

An Efficient Approach for Keyword Selection; Improving Accessibility of Web Contents by General Search Engines

General search engines often provide low precise results even for detailed queries. So there is a vital need to elicit useful information like keywords for search engines to provide acceptable results for user’s search queries. Although many methods have been proposed to show how to extract keywords automatically, all attempt to get a better recall, precision and other criteria which describe h...

Publication date: 2001